Using Response Functions for Strategy Training and Evaluation
Abstract
Extensive-form games are a powerful framework for modeling sequential multiagent interactions. In extensive-form games with imperfect information, Nash equilibria are generally used as a solution concept, but computing a Nash equilibrium can be intractable in large games. Instead, a variety of techniques are used to find strategies that approximate Nash equilibria. Traditionally, an approximate Nash equilibrium strategy is evaluated by measuring its worst-case performance, or exploitability. However, because exploitability fails to capture how likely the worst case is to be realized, it provides only a limited picture of strategy strength, and there is extensive empirical evidence that exploitability can correlate poorly with one-on-one performance against a variety of opponents. In this thesis, we introduce a class of adaptive opponents called pretty-good responses that exploit a strategy but have only limited exploitative power. By playing a strategy against a variety of counter-strategies created with pretty-good responses, we get a more complete picture of strategy strength than that offered by exploitability alone. In addition, we show how standard no-regret algorithms can be modified to learn strategies that are strong against adaptive opponents. We prove that this technique can produce optimal strategies for playing against pretty-good responses. We empirically demonstrate the effectiveness of the technique by finding static strategies that are strong against Monte Carlo opponents who learn by sampling our strategy, including the UCT Monte Carlo tree search algorithm.
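The gap between worst-case exploitability and performance against a limited adaptive opponent can be illustrated on a toy problem. The sketch below is not the thesis's construction: it uses rock-paper-scissors as a stand-in for an extensive-form game, and `sampled_response` is a hypothetical opponent, assumed here only for illustration, that best-responds to an empirical estimate of our strategy built from a fixed number of sampled actions.

```python
import random

# Our payoffs in rock-paper-scissors (zero-sum): PAYOFF[i][j] is our payoff
# when we play i and the opponent plays j (0 = rock, 1 = paper, 2 = scissors).
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def value_against(strategy, opp_action):
    """Expected payoff of our mixed strategy against one opponent action."""
    return sum(p * PAYOFF[i][opp_action] for i, p in enumerate(strategy))


def exploitability(strategy, game_value=0.0):
    """Game value minus our worst-case payoff over all opponent actions."""
    worst = min(value_against(strategy, j) for j in range(3))
    return game_value - worst


def sampled_response(strategy, num_samples, rng):
    """Hypothetical limited opponent (an assumption, not from the thesis):
    best-responds to an estimate of our strategy from sampled actions."""
    samples = rng.choices(range(3), weights=strategy, k=num_samples)
    estimate = [samples.count(a) / num_samples for a in range(3)]
    # The opponent picks the action that minimizes our estimated payoff.
    return min(range(3), key=lambda j: value_against(estimate, j))


def mean_payoff_vs_sampler(strategy, num_samples, trials=2000, seed=0):
    """Our average true payoff against the sampling opponent."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        action = sampled_response(strategy, num_samples, rng)
        total += value_against(strategy, action)
    return total / trials


if __name__ == "__main__":
    biased = [0.5, 0.25, 0.25]
    print("exploitability:", exploitability(biased))
    for n in (1, 10, 100):
        print(n, "samples:", mean_payoff_vs_sampler(biased, n))
```

In this toy setting, a full best response always holds us to the worst-case payoff, whereas the sampling opponent only does so when its estimate is accurate enough, so the average payoff against it can never fall below the negated exploitability and is typically higher when few samples are used.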
Publication date: 2015